Model-based biclustering of multivariate longitudinal trajectories / Alfò, M.; Marino, M. F.; Martella, F. - (2019), pp. 34-34. (Paper presented at the Final CRoNoS meeting and 2nd workshop and training school on Multivariate Data Analysis and Software (CRoNoS & MDA 2019), held in Limassol (Cyprus)).
Model-based biclustering of multivariate longitudinal trajectories
M. Alfò; M. F. Marino; F. Martella
2019
Abstract
Model-based clustering is now a popular analysis tool thanks to its probabilistic foundations and its great flexibility. To deal with multivariate longitudinal sequences, standard approaches need to be extended to accommodate the peculiarities of this kind of data. Such data come in the form of three-way data: the first dimension identifies individuals, the second identifies variables, and the third identifies time occasions. A method for the simultaneous clustering of subjects and multivariate outcomes repeatedly recorded over time is proposed. In particular, a finite mixture of generalized linear models is considered to cluster individuals; within each component of the finite mixture, a flexible and parsimonious parameterization of the corresponding canonical parameter is adopted to identify clusters of outcomes evolving in a similar manner across time. This allows us to obtain clusters of individuals that share common trajectories for one or more outcomes over time and, consequently, a dimensionality reduction on the first two dimensions of the three-way data structure. Parameter estimates are derived within a maximum likelihood framework, by considering an indirect approach based on an extended expectation-maximization algorithm.